126 research outputs found

    PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning

    Online class-incremental continual learning is a specific setting of continual learning: a model must continuously learn new classes from a data stream in which each sample is seen only once, and it therefore suffers from catastrophic forgetting, i.e., the loss of historical knowledge of old classes. Existing replay-based methods effectively alleviate this issue by saving part of the old data and replaying it in either a proxy-based or a contrastive-based manner. Although both replay manners are effective, the former tends to be biased toward new classes due to class imbalance, while the latter is unstable and hard to converge because of the limited number of stored samples. In this paper, we conduct a comprehensive analysis of these two replay manners and find that they are complementary. Inspired by this finding, we propose a novel replay-based method called proxy-based contrastive replay (PCR). The key operation is to replace the contrastive samples of anchors with their corresponding proxies in the contrastive-based loss. This alleviates catastrophic forgetting by effectively addressing the imbalance issue while maintaining faster model convergence. We conduct extensive experiments on three real-world benchmark datasets, and the empirical results consistently demonstrate the superiority of PCR over various state-of-the-art methods.
    Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables.
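    The key operation described above, contrasting each anchor against class proxies rather than other samples, can be sketched as a proxy-based softmax loss. This is a minimal NumPy illustration, not the paper's implementation; the temperature value and the use of all proxies (rather than only those of classes in the batch) are simplifying assumptions.

```python
import numpy as np

def pcr_loss(features, labels, proxies, temperature=0.09):
    """Sketch of a proxy-based contrastive loss: each anchor feature is
    contrasted against class proxies (one learnable vector per class)
    instead of against other replayed samples.  `temperature` is a
    hypothetical value, not taken from the paper."""
    # L2-normalise anchors and proxies so dot products are cosine similarities
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    logits = f @ p.T / temperature                 # shape: (batch, num_classes)
    # softmax cross-entropy against each anchor's own class proxy
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

    Because every class contributes exactly one proxy to the denominator, the loss is not dominated by whichever classes happen to have many samples in the replay batch, which is the imbalance argument the abstract makes.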

    MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning

    Few-Shot Learning (FSL) is a challenging task: how can novel classes be recognized from only a few examples? Pre-training based methods tackle the problem effectively by pre-training a feature extractor and then predicting novel classes via a cosine nearest-neighbor classifier with mean-based prototypes. Nevertheless, due to data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel meta-learning based prototype optimization framework that rectifies prototypes, i.e., it introduces a meta-optimizer to optimize them. Although existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address this issue, we regard the gradient and its flow as meta-knowledge and propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer, called MetaNODE, to polish prototypes. In this meta-optimizer, we first view the mean-based prototypes as initial prototypes, and then model the process of prototype optimization as continuous-time dynamics specified by a Neural ODE. A gradient flow inference network is carefully designed to learn to estimate the continuous gradient flow for the prototype dynamics. Finally, the optimal prototypes are obtained by solving the Neural ODE. Extensive experiments on miniImagenet, tieredImagenet, and CUB-200-2011 show the effectiveness of our method.
    Comment: Accepted by AAAI 202
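    The prototype-refinement idea, treating mean-based prototypes as the initial state of an ODE and integrating a learned gradient flow forward in time, can be sketched with a simple Euler solver. The `grad_flow` callable below stands in for the paper's gradient flow inference network, which is a trained model; everything here is a hypothetical illustration of the solver step only.

```python
import numpy as np

def refine_prototypes(init_protos, grad_flow, t_end=1.0, steps=10):
    """Euler integration of the prototype dynamics dp/dt = grad_flow(p, t).
    `init_protos` are the mean-based prototypes (the ODE's initial value);
    `grad_flow(p, t)` is a placeholder for a learned gradient-flow network.
    The step count and horizon are illustrative choices, not the paper's."""
    p = init_protos.copy()
    dt = t_end / steps
    for k in range(steps):
        p = p + dt * grad_flow(p, k * dt)   # one explicit Euler step
    return p
```

    In practice a Neural ODE would use an adaptive solver rather than fixed-step Euler, but the structure, initial prototypes in, refined prototypes out, is the same.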

    Magnesia-stabilised zirconia solid electrolyte assisted electrochemical investigation of iron ions in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K

    Production of metallic iron through molten oxide electrolysis using inert electrodes is an alternative route for fast ironmaking without CO2 emissions. The fact that many inorganic oxides melt at ultrahigh temperatures (>1500 K) challenges the conventional electro-analytical techniques used in aqueous, organic and molten salt electrolytes. However, in order to design a feasible and effective electrolytic process, it is necessary to thoroughly understand the electrochemical properties of iron ions in molten oxide electrolytes. In this work, a magnesia-stabilised zirconia (MSZ) tube with a closed end was used to construct an integrated three-electrode cell, with the “MSZ | Pt | O2 (air)” assembly functioning as the solid electrolyte, the reference electrode and also the counter electrode. Electrochemical reduction of iron ions was systematically investigated on an iridium (Ir) wire working electrode in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K by cyclic voltammetry (CV), square wave voltammetry (SWV), chronopotentiometry (CP) and potentiostatic electrolysis (PE). The results show that the electro-reduction of the Fe2+ ion to Fe on the Ir electrode in the molten slag follows a single two-electron transfer step, and that the rate of the process is diffusion controlled. The peak current on the obtained CVs is proportional to the concentration of the Fe2+ ion in the molten slag and to the square root of the scan rate. The diffusion coefficient of Fe2+ ions in the molten slag containing 5 wt% FeO at 1723 K was derived to be (3.43 ± 0.06) × 10^-6 cm^2 s^-1 from CP analysis. However, two subsequent processes, i.e., alloy formation on the Ir electrode surface and interdiffusion, were found to affect the kinetics of iron deposition. An ECC mechanism is proposed to account for the CV observations. The findings from this work confirm that zirconia-based solid electrolytes can play an important role in fundamental electrochemical research in high-temperature molten slag electrolytes.
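    Deriving a diffusion coefficient from chronopotentiometry, as done above, conventionally rests on the Sand equation, i·τ^(1/2) = n·F·A·C·(πD)^(1/2)/2, relating the applied current i, transition time τ, electron number n, electrode area A and concentration C. The snippet below solves that generic relation for D; it is a textbook formula, not the authors' exact fitting procedure, and all parameter values in the usage are hypothetical.

```python
import math

def sand_diffusion_coefficient(current, tau, n, area, conc):
    """Diffusion coefficient D from the Sand equation for chronopotentiometry:
        current * sqrt(tau) = n * F * area * conc * sqrt(pi * D) / 2
    Units must be consistent, e.g. A, s, cm^2 and mol/cm^3 give D in cm^2/s.
    This is the generic relation, not the paper's specific analysis."""
    F = 96485.0  # Faraday constant, C/mol
    return (2.0 * current * math.sqrt(tau) / (n * F * area * conc)) ** 2 / math.pi
```

    With n = 2 (the two-electron Fe2+ → Fe step reported above) and measured i, τ, A and C, this inversion yields D directly.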

    Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code

    This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel extensions to explicitly manage the complexities that arise when targeting these systems. The framework is designed for the areas of image processing, stencils, linear algebra and deep learning. Tiramisu has two main features: it relies on a flexible representation based on the polyhedral model, and it has a rich scheduling language allowing fine-grained control of optimizations. Tiramisu uses a four-level intermediate representation that allows full separation between the algorithms, loop transformations, data layouts, and communication. This separation simplifies targeting multiple hardware architectures with the same algorithm. We evaluate Tiramisu by writing a set of image processing, deep learning, and linear algebra benchmarks and compare them with state-of-the-art compilers and hand-tuned libraries. We show that Tiramisu matches or outperforms existing compilers and libraries on different hardware architectures, including multicore CPUs, GPUs, and distributed machines.
    Comment: arXiv admin note: substantial text overlap with arXiv:1803.0041
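    The algorithm/schedule separation described above can be illustrated with loop tiling, a classic transformation such scheduling languages express. The Python sketch below is not Tiramisu code (Tiramisu's scheduling language is C++-embedded); it only shows that the transformed loop nest computes the same algorithm under a different iteration order.

```python
import numpy as np

def matmul_tiled(A, B, tile=2):
    """Illustrative loop tiling: the algorithm (C[i,j] += A[i,k]*B[k,j])
    is unchanged; only the iteration order is transformed into blocks,
    the kind of schedule a polyhedral compiler applies for locality.
    A pure-Python loop nest, for clarity rather than speed."""
    n, kdim = A.shape
    m = B.shape[1]
    C = np.zeros((n, m))
    for ii in range(0, n, tile):              # inter-tile loops over blocks
        for jj in range(0, m, tile):
            for kk in range(0, kdim, tile):
                for i in range(ii, min(ii + tile, n)):      # intra-tile loops
                    for j in range(jj, min(jj + tile, m)):
                        for k in range(kk, min(kk + tile, kdim)):
                            C[i, j] += A[i, k] * B[k, j]
    return C
```

    A scheduling language lets the programmer state "tile by 2" as a directive on the untiled algorithm, instead of hand-rewriting the nest as above, which is what makes retargeting the same algorithm to different hardware tractable.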